to prosecute the cancelled subject matter in another 
application. This amendment has been made for the sole purpose 
of advancing the prosecution of this case. 

ORF-R is also known as ORF-F, nef, 3'orf, B, B', or F gene. 
See Gallo et al., "HIV/HTLV Gene Nomenclature," Nature, 333, 504 
(1988) (Exhibit 1); and Wain-Hobson et al., "Nucleotide Sequence 
of the AIDS Virus, LAV," Cell, 40, 9-17, 12 (1985) (Exhibit 2). 
In Paper No. 7, the Examiner stated that 

[a]pplicant has not demonstrated a 
utility for these sequences as probes. 
How specific are they for detecting HIV 
and distinguishing it from other 
retroviruses, in particular HTLV I and 
HTLV II? 

See page 4, lines 9-11 of Paper No. 7. 

In applicants' Response, filed November 24, 1993, it was 
noted that the claimed nucleic acid has utility as a diagnostic 
probe. See page 7 of the Response, and page 14, line 11 through 
page 15, line 8 of the specification. Gallo et al., cited 
above, notes that nef is not found in HTLV I or HTLV II. Thus, 
applicants' claimed probe has utility as a diagnostic probe 
unique to HIV-1, distinguishable from HTLV I and HTLV II. 

It is courteously submitted that this application is in 
condition for allowance. Reconsideration and reexamination of 
this application, and allowance of the pending claim at the 
Examiner's convenience, are respectfully requested. 

The Commissioner is hereby authorized to charge any fees 
associated with this Amendment to our Deposit Account 
No. 06-0916. If a fee is required for an Extension of Time 
under 37 C.F.R. § 1.136 not accounted for above, such extension 
is requested and should also be charged to our Deposit Account. 

Respectfully submitted, 
GARRETT & DUNNER 

HIV/HTLV gene nomenclature 

Sir— The complexities of the genomes of 
human retroviruses (the human T-cell leukaemia viruses, HTLV-I and 
HTLV-II, and the AIDS-causing human immunodeficiency viruses, HIV-1 
and HIV-2) are being unravelled at a rapid pace which is likely to 
continue and expand. In addition to containing a large ensemble of 
positive and negative regulatory genes that orchestrate virus 
expression, these viruses are also remarkable in that they seem to 
have converged onto parallel regulatory pathways. Two of the 
regulatory genes of the immunodeficiency viruses are analogous to 
the two regulatory genes of the leukaemia viruses, although their 
detailed mechanisms of action may be quite different. Deciphering 
the modes of action of the regulatory genes of these viruses is 
crucial to the understanding of their pathogenesis as well as to 
development of therapeutic agents. Because of the tremendous 
activity in this field, more than one name has sometimes been given 
to a single gene and the same name may also apply to more than one 
gene. In the interest of the many new investigators entering the 
field for the first time, we feel it is important that we reach a 
standard nomenclature for all known genes of HIV and HTLV. We 
propose the scheme outlined in the table. 
Robert Gallo 
Flossie Wong-Staal 
National Cancer Institute, NIH, 
Bethesda, Maryland 20892, USA 

Luc Montagnier 
Department of Virology, 
Institut Pasteur, 
75724 Paris Cedex 15, France 

William A. Haseltine 
Dana-Farber Cancer Institute, 
Boston, Massachusetts 02115, USA 

Mitsuaki Yoshida 
Department of Viral Oncology, 
Cancer Institute, Tokyo 170, Japan 



Proposed name                    Previous names       Molecular    Function
(and derivation)                                      mass (kd)

HTLV-I and HTLV-II genes
tax1 (transactivator)            x-lor, p40x, tat-1   41, 41, 42   Transactivator of all
tax2                             tat-2, TA            38           viral proteins
rex1 (regulator of expression    pp27x, tet           27           Regulates expression
rex2  of virion proteins)                             25           of virion proteins

HIV-1
tat (transactivator)             tat-3, TA            14           Transactivator of all
                                                                   viral proteins
rev (regulator of expression     art, trs             19, 20       Regulates expression
  of virion proteins)                                              of virion proteins
vif (virion infectivity factor)  sor, A, P', Q        23           Determines virus
                                                                   infectivity
vpr (R)                                               ?            Unknown
nef (negative factor)            3'orf, B, B', F      27           Reduces virus
                                                                   expression, GTP-binding
vpx (X) (only in HIV-2 and                            16, 14       Unknown
  SIV)

[Figure: genome maps of HTLV-I/II, HIV-1 and HIV-2 with the LTRs 
marked; not reproduced. The caption is largely illegible; it refers 
to vpr and vpx and to STLV-III/SIV.] 



Estimating the incubation period for AIDS patients 

Sir— The nonparametric analyses of the data on transfusion-related 
AIDS considered by Medley et al. indicate problems of 
identifiability. With data obtained by retrospective determination 
of the time of infection for diagnosed AIDS cases, it is only 
possible to estimate the early part of the incubation distribution 
up to a constant of proportionality. The same applies to the total 
number of infections by blood transfusion before any given time. 
The transfusion data themselves are unable to discriminate between 
high infection rates coupled with long incubation times on the one 
hand, or low infection rates and short incubation times on the 
other. 

As do Medley et al., we postulate a function h(x) which specifies 
the increase over time of the number of HIV-infected individuals 
who eventually develop AIDS, and a probability density function 
f(s) for the incubation time of those individuals. The 
corresponding likelihood function can be maximized jointly with 
respect to h and f. As this depends only on the product of h and f, 
it is not possible to estimate either of these functions 
completely; they may be individually estimated only up to constants 
of proportionality c and c^-1 respectively. Nonparametric estimates 
of the proportion of eventual AIDS cases that are diagnosed within 
t years of infection, F(t) = ∫₀ᵗ f(u) du, are given in the figure 
for the three age groups considered by Medley et al. In this figure 
we show the estimates of F(t) so that for each group, c = F(7.5). 
For the children, the levelling of the estimate of F(t) by about 
3.5 years suggests that the whole of the distribution of incubation 
times has been seen; it may then be reasonable to suppose that 
c = 1 but, as also noted by Medley et al., a second wave of 
incubation times that exceed 7.5 years is not excluded by these 
data. For the other two age groups, there is nothing in the 
transfusion data themselves to suggest a value for c. As a 
consequence, it is impossible to place any upper bound on the 
median incubation time. To estimate this, 

Summary 

The complete 9193-nucleotide sequence of the probable causative 
agent of AIDS, lymphadenopathy-associated virus (LAV), has been 
determined. The deduced genetic structure is unique: it shows, in 
addition to the retroviral gag, pol, and env genes, two novel open 
reading frames we call Q and F. Remarkably, Q is located between 
pol and env and F is half-encoded by the U3 element of the LTR. 
These data place LAV apart from the previously characterized family 
of human T cell leukemia/lymphoma viruses. 

Introduction 

The recent onset of severe opportunistic infections among 
previously healthy male homosexuals has led to the characterization 
of the acquired immune deficiency syndrome (AIDS) (Gottlieb et al., 
1981; Masur et al., 1981). The disease has spread dramatically, and 
new high-risk groups have been identified: patients receiving blood 
products, intravenous drug addicts, and individuals originating 
from Haiti and Central Africa (Piot et al., 1984). AIDS is a fatal 
disease, and there is at present no specific treatment. The 
causative agent was suspected to be of viral origin since the 
epidemiological pattern of AIDS was consistent with a transmissible 
disease, and cases had been reported after treatment involving 
ultrafiltered anti-hemophilia preparations (Daly and Scott, 1983). 
A decisive step in AIDS research was the discovery of a novel human 
retrovirus called lymphadenopathy-associated virus (LAV) 
(Barre-Sinoussi et al., 1983). The properties of the virus 
consistent with its etiological role in AIDS are: the recovery of 
many independent isolates from patients with AIDS or related 
diseases (Montagnier et al., 1984); high LAV seropositivity among 
these populations (Brun-Vezinet et al., 1984); a tropism and 
cytopathic effect in vitro for the helper/inducer T-lymphocyte 
subset T4 (Klatzmann et al., 1984), also found depleted in vivo. 

Other groups have reported the isolation of human retroviruses, the 
human T cell leukemia/lymphoma/lymphotropic virus type III 
(HTLV-III) (Popovic et al., 1984) and the AIDS-associated 
retrovirus (ARV), which display biological and seroepidemiological 
properties very similar to if not identical with those of LAV (Levy 
et al., 1984; Popovic et al., 1984; Schupbach et al., 1984). Both 
LAV and HTLV-III genomes have been molecularly cloned (Alizon et 
al., 1984; Hahn et al., 1984). Their restriction maps show 
remarkable agreement, including a Hind III restriction site 
polymorphism, bearing in mind the variability of this virus (Shaw 
et al., 1984) and confirming that these two viruses represent a 
single viral lineage. 

In addition to its obvious diagnostic and therapeutic 
potential, the LAV DNA nucleotide sequence is essential 
to an understanding of the genetics and molecular biology 
of the virus and its classification among retroviruses. We 
report here the complete 9193-nucleotide sequence of the 
LAV genome established from cloned proviral DNA. 

Results 

DNA Sequence and Organization of the LAV Genome 

We have reported previously the molecular cloning of both cDNA and 
integrated proviral forms of LAV (Alizon et al., 1984). The 
recombinant phage clones were isolated from a genomic library of 
LAV-infected human T-lymphocyte DNA partially digested by Hind III. 
The insert of recombinant phage λJ19 was generated by Hind III 
cleavage within the R element of the long terminal repeat (LTR). 
Thus each extremity of the insert contains one part of the LTR. We 
have eliminated the possibility of clustered Hind III sites within 
R by sequencing part of an LAV cDNA clone, pLAV 75 (Alizon et al., 
1984), corresponding to this region (data not shown). Thus the 
total sequence information of the LAV genome can be derived from 
the λJ19 clone. 

Using the M13 shotgun cloning and dideoxy chain termination method 
(Sanger et al., 1977), we have determined the nucleotide sequence 
of the λJ19 insert. The reconstructed viral genome with two copies 
of the R sequence is 9193 nucleotides long. The numbering system 
starts at the cap site (see below) of virion RNA (Figure 1). 

The viral (+) strand contains the statutory retroviral genes 
encoding the core structural proteins (gag), reverse transcriptase 
(pol), and envelope protein (env), and two extra open reading 
frames (orf) that we call Q and F (Table 1). The genetic 
organization of LAV, 5'LTR-gag-pol-Q-env-F-3'LTR, is unique. 
Whereas in all replication-competent retroviruses pol and env genes 
overlap, in LAV they are separated by orf Q (192 amino acids) 
followed by four small (<100 triplets) orf. The orf F (206 amino 
acids) slightly overlaps the 3' end of env and is remarkable in 
that it is half-encoded by the U3 region of the LTR. 
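
An open reading frame in the sense used above is a run of triplets 
not interrupted by a stop codon. As a minimal sketch, assuming the 
(+) strand is available as a plain DNA string, a scan of the three 
forward frames could look like the Python below. The function name 
`open_reading_frames`, the `min_triplets` threshold and the toy 
sequence are illustrative assumptions, not the authors' procedure. 

```python
# Hypothetical helper: list stop-free stretches in the three forward frames.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def open_reading_frames(seq, min_triplets=100):
    """Return (frame, start, end, n_triplets) for each stop-free stretch.

    start/end are 0-based, end-exclusive offsets into seq; the stop codon
    itself is not included in the stretch.
    """
    seq = seq.upper()
    found = []
    for frame in range(3):
        start = frame
        pos = frame
        while pos + 3 <= len(seq):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_triplets = (pos - start) // 3
                if n_triplets >= min_triplets:
                    found.append((frame, start, pos, n_triplets))
                start = pos + 3  # next stretch begins after the stop
            pos += 3
        # stretch still open at the end of the sequence
        n_triplets = (pos - start) // 3
        if n_triplets >= min_triplets:
            found.append((frame, start, pos, n_triplets))
    return found


if __name__ == "__main__":
    toy = "ATG" + "GCT" * 5 + "TAA" + "CC"  # made-up sequence, not LAV
    for orf in open_reading_frames(toy, min_triplets=4):
        print(orf)
```

With `min_triplets=100` the scan would report only stretches at 
least as long as the threshold the text uses to call an orf "small". 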

Such a structure clearly places LAV apart from previously sequenced 
retroviruses (Figure 2). The (-) strand is apparently noncoding. 
The additional Hind III site of the LAV clone λJ81 (with respect to 
λJ19) maps to the apparently noncoding region between Q and env 
(positions 5166-5745). Starting at position 5501 is a sequence 
(AAGCQT) that differs by a single base (underlined) from the 
Hind III recognition sequence. It is anticipated that many of the 
restriction site polymorphisms between different isolates will map 
to this region. 
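
The observation above, a hexamer one base away from the Hind III 
recognition sequence (AAGCTT), can be illustrated with a short 
mismatch scan. This is only a sketch: the function name 
`near_sites` and the toy sequence are assumptions made for 
illustration, and the example near-site is not taken from the LAV 
sequence. 

```python
# Hypothetical helper: find windows within a given number of mismatches
# of a restriction site. AAGCTT is the Hind III recognition sequence.
HINDIII_SITE = "AAGCTT"


def near_sites(seq, site=HINDIII_SITE, max_mismatches=1):
    """Return (offset, window, mismatches) for each close match (0-based)."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(site) + 1):
        window = seq[i:i + len(site)]
        mismatches = sum(a != b for a, b in zip(window, site))
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits


if __name__ == "__main__":
    toy = "CCAAGCTTCCAAGGTTCC"  # made-up sequence for illustration
    for hit in near_sites(toy):
        print(hit)
```

Windows reported with one mismatch are the candidates where a single 
base change between isolates would create or destroy a site. 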

[Figure 1. Nucleotide sequence of the LAV genome with the deduced 
amino acid sequences of the open reading frames (ORF Q and ENV 
labels visible). The sequence listing is not legible in this copy. 
Legible caption fragment: base composition T, 22.2%; C, 17.8%.] 



[...] organization of a reconstructed LTR and [...] LTR is a 
perfect 15 bp polypurine tract. The other three polypurine tracts 
observed between nucleotides [...] are not followed by a sequence 
that is com[...] site established from the sequence of the 3' end 
of [...] LAV cDNA (Alizon et al., 1984). Thus [...] synthesized 
[...] LAV cDNA. After alkaline hydro[...] 



[Figure 2. Comparison of the genetic organization of LAV with those 
of other retroviruses; figure not reproduced. Labels legible for 
LAV: gag, pol, orf Q, env, orf F.] 

[...] the primer R+U5 was found to be 181 ± 1 bp (Figure [...]). 
[...] can be located. Finally, U3 is 456 bp long. The LAV LTR [...] 
duplicated in the 3' end of orf [...]. In addition, MMTV and LAV 
are exceptional in that the U3 element can encode an orf. In the 
case of MMTV, U3 contains the whole orf while, in LAV, U3 contains 
110 codons of the 3' half of orf F. 



Figure 3. Schematic Representation of the LAV Long Terminal Repeat 
The LTR was reconstructed from the sequence of λJ19 by juxtaposing 
[...] LAV cDNA clone pLAV 75 (Alizon et al., 1984) [...] the 
possibility of clustered Hind III sites in the R region of LAV. LTR 
are [...] inverted repeated sequence (IR). Both of the viral 
elements [...] tRNA primer binding site (PBS) for 5' LTR and 
polypurine tract (PU) for 3' LTR. Also indicated [...] and 
polyadenylation site (CAA). The location of the open reading frame 
F (648 nucleotides) is shown above the LTR scheme. 

Viral Proteins 
gag 

Near the 5' extremity of the gag orf is a "typical" initiation 
codon (Kozak, 1984) (position 336), which is not only the first in 
the gag orf, but the first [...]. The precursor protein is 500 
amino acids long. The calculated Mr of 55,[...] agrees with the 
55 kd gag precursor polypeptide ([...], unpublished results). The 
N-terminal amino acid sequence of the [...] protein [...] matches 
perfectly with the translated nucleotide sequence starting from 
position 732 (see [...]) [...] LAV genome and the immunologically 
characterized LAV [...] protein. The protein encoded 5' of the 
[...] sequence is rather hydrophilic. Its calculated Mr of 14,866 
is consistent with that of the gag protein p18. The 3' part of the 
gag region probably codes for the retroviral nucleic acid binding 
protein (NBP). Indeed, as in HTLV-I (Seiki et al., 1983) and RSV 
(Schwartz et al., 1983) the motif Cys-X2-Cys-X8-9-Cys common to all 
NBP (Oroszlan et al., 1984) is found duplicated (nucleotides 1509 
and 1572 in LAV sequence). Consistent with its function the 
putative NBP is extremely basic (17% Arg + Lys). 
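
The two kinds of figure quoted for the gag products, a calculated 
Mr and a percentage of basic residues, can be approximated from a 
protein sequence or its length. The sketch below assumes the common 
rule of thumb of about 110 daltons per residue, which is an 
approximation introduced here and not the calculation used in the 
paper; the function names are hypothetical. 

```python
# Rule-of-thumb average residue mass in daltons (an assumption, not a
# value taken from the paper).
AVERAGE_RESIDUE_MASS = 110.0


def approx_mr(n_residues, mean_residue_mass=AVERAGE_RESIDUE_MASS):
    """Rough relative molecular mass for a protein of n_residues."""
    return n_residues * mean_residue_mass


def fraction_basic(protein):
    """Fraction of residues that are Arg (R) or Lys (K), one-letter code."""
    protein = protein.upper()
    if not protein:
        return 0.0
    return sum(protein.count(aa) for aa in "RK") / len(protein)


if __name__ == "__main__":
    # A 500-residue precursor comes out near the 55 kd quoted in the text.
    print(approx_mr(500))
    print(fraction_basic("KRAA"))  # toy peptide, not the NBP sequence
```

An exact figure would sum the individual residue masses; the flat 
average is only good to a few percent. 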

[...] to 1003 amino acids (calculated Mr = 113,629). Since the 
first methionine codon is 92 triplets from the origin of the open 
reading frame, it is possible that the protein is translated from a 
spliced messenger RNA, giving a gag-pol [...]. 

The pol coding region is the only one in which significant homology 
has been found with other retroviral protein sequences, three 
domains of homology being apparent. The first is a very short 
region of 17 amino acids (starting at [...]). Homologous regions 
are located within the p15 [...] of RSV ([...] Moelling, 1978) and 
a polypeptide encoded by an open reading frame located between gag 
and pol of HTLV-I (Figure 5) (Schwartz et al., 1983; [...] et al., 
1983). This first domain could thus correspond to a conserved 
sequence in viral proteases. Its different locations within the 
three genomes may not be significant since retroviruses, by 
splicing or other mechanisms, express a gag-pol polyprotein 
precursor (Schwartz et al., 1983; [...]). The second and most 
extensive region of homology (starting at 2048) probably represents 
the core sequence of the reverse transcriptase. Over a region of 
250 amino acids, with only minimal insertions or deletions, LAV 
shows 38% amino acid identity with RSV, 25% with HTLV-I, and 21% 
with MoMuLV (Shinnick et al., 1981) while HTLV-I and RSV show 38% 
[...] in the same region. A third homologous region is situated at 
the 3' end of the pol reading frame and corresponds to part of the 
pp32 peptide of RSV that has exonuclease activity (Misra et al., 
1982). Once again, there is greater homology with the corresponding 
RSV sequence than with HTLV-I. 
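
Percent amino acid identity of the kind quoted above is computed 
over an alignment. As a minimal sketch, assuming the two sequences 
are already aligned to equal length with "-" marking insertions or 
deletions, and assuming gapped columns are left out of the 
denominator (conventions differ, and the paper does not state 
which one it used): 

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over ungapped alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns with an insertion or deletion
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy aligned peptides, not taken from any of the viruses discussed.
    print(percent_identity("AAAA", "AAAT"))
    print(percent_identity("AC-E", "ACDE"))
```

Counting gapped columns in the denominator instead would lower the 
figure, so the convention should be stated alongside any value. 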

The env open reading frame has a possible initiator methionine 
codon very near the beginning (eighth triplet). [...] sequence 
shown in Figure 1. [...] protein (861 [...]) [...] with the known 
[...] X-Ser/Thr). [...] of Trp residues at [...] et al., 1983), 
corresponding to a [...] encoded by nucleotides 5815-5850 [...] 
protein sequence is that the [...]. The env protein shows no [...] 
has a protease-associated function. [...] proteins of all 
leukemogenic retroviruses (Cianciolo et al., 1984) is not present 
in LAV env. 

[...] for the putative proteins [...] protein sequence data bank 
[...]. Furthermore, [...] retroviruses can transduce cellular [...] 
proto-oncogenes [...] Q and F represent [...] does not hybridize to 
the [...] and env genes (data not shown). 

Although LAV is both morphologically and biochemically 
(Barre-Sinoussi et al., 1983) distinct from HTLV-I and -II, it 
remained possible [...] organized in a similar [...] of HTLV-I and 
-II [...] more distantly related [...] genomes, which they share 
[...], 1984) are not observed in the case of LAV [...] noncoding 
stretch 

Table 2. Comparison of the Size of the LAV LTR and LTR-Related 
Elements to Those of Other Retroviruses 

           LTR     U3      R     U5    PU    PBS   IR
LAV        638     456     97    85    15    LYS   4
HTLV-I     759     355     [?]   176   12*   PRO   4*
HTLV-II    763     314     [?]   261   [?]   PRO   4*
MMTV       1,332   1,197   11    124   19    LYS   8*
MoMuLV     594     449     68    77    13    PRO   13
RSV        335     234     21    80    11    TRP   15
SNV        601     420     97    80    13    PRO   9

Adapted from Chen and Barker (1984). 
* Imperfect match or tract. 
[?] Entry not legible in this copy. 
SNV = spleen necrosis virus (Shimotohno and Temin, 1982). 
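
The LAV figures can be cross-checked by simple arithmetic: the LTR 
should equal U3 + R + U5, and R + U5 should agree with the 
181 ± 1 bp strong-stop measurement given in the text. The sketch 
below takes the LAV entries as LTR 638, U3 456, R 97 and U5 85; the 
assignment of 97 and 85 to R and U5 is read from the table as 
reproduced here, and the names used are hypothetical. 

```python
# LAV element sizes in base pairs, as read from Table 2 (the R and U5
# assignments are an assumption based on the table as reproduced here).
LAV_LTR = {"LTR": 638, "U3": 456, "R": 97, "U5": 85}


def ltr_residual(elements):
    """LTR length minus the sum of its U3, R and U5 parts (0 if consistent)."""
    return elements["LTR"] - (elements["U3"] + elements["R"] + elements["U5"])


def within_measurement(r_plus_u5, measured=181, tolerance=1):
    """True if R+U5 agrees with the measured value within the tolerance."""
    return abs(r_plus_u5 - measured) <= tolerance


if __name__ == "__main__":
    print(ltr_residual(LAV_LTR))
    print(within_measurement(LAV_LTR["R"] + LAV_LTR["U5"]))
```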



composed of unusually long U5 and R elements and the 
polyadenylation signal being situated in U3 instead of R (Seiki et 
al., 1983; Sagata et al., 1984; Shimotohno et al., 1984). We show 
here that, in contrast, the 3' end of the LAV envelope gene 
overlaps an open reading frame, termed F, that has the coding 
capacity for 206 amino acids and extends within the LTR (110 amino 
acids are encoded by the U3 region). The putatively encoded 
polypeptide (pF), the primary structure of which can be deduced, 
does not show any homology with the theoretical X gene products of 
the HTLV/BLV family. Also, the U5 and R elements are shorter 
(Table 2) and the polyadenylation signal is located within R, as is 
the case for all retroviruses except the HTLV/BLV family. 
Additionally, LAV uses tRNA(Lys) as (-) strand primer, as opposed 
to tRNA(Pro) employed by all other mammalian retroviruses except 
MMTV (Donehower et al., 1981). Those homologies detected between 
the polymerase and protease domains of LAV and HTLV are also found 
in several retroviruses, RSV in particular. 

It has been reported that a cloned HTLV-III genome hybridizes 
(Tm - 28°C) to sequences in the gag-pol and X regions of HTLV-I and 
-II; although restriction maps of cloned LAV and HTLV-III show 
almost perfect agreement (Hahn et al., 1984), we were unable to 
detect any such hybridization between LAV and HTLV-II (Tm - 55°C) 
(Alizon et al., 1984). Indeed, there is a punctual region of 
homology between LAV and HTLV-I (23/27 nucleotides starting at 
position 1859 in the LAV sequence) but nothing significant between 
the two viruses in the X region of HTLV-I. One possible reason for 
this discrepancy is that HTLV-III is really different from LAV. 
However, it was subsequently reported that there was very minimal, 
if any, homology between orf X (of HTLV-I) and HTLV-III (Shaw et 
al., 1984). 

Discussion 

Regulatory sequences carried by retroviral LTR are believed to be 
involved in specific interactions between the viral genome and the 
host cell (Srinivasan et al., 1984). The LTR sequences of LAV are 
unique among retroviruses. That could reflect an original mode of 
gene expression, possibly in relation to particular transcriptional 
factors present in the virus-harboring cell. This hypothesis can be 
tested by studying the regulatory activity of the LAV LTR sequences 
in transient or long-term experiments involving an indicator gene 
and different cellular contexts. 

The presence of the Q and F reading frames in addition to the 
conventional gag-pol-env set of genes is unexpected. One should now 
address the question of their role in the viral cycle and 
pathogenicity by trying to characterize their protein product(s). 
It is tempting to speculate on a role of such polypeptide(s) in T4 
cells' mortality, a problem that can be studied by designing 
synthetic peptides for antibody production or by using 
site-directed mutagenesis of Q and F coding regions. 

The peculiar genetic structure of LAV poses the question of its 
origin. The virus shares common traits with other (apparently 
unrelated) retroviruses. For instance, the unusually large size of 
the outer membrane glycoprotein (env) and a comparably sized genome 
are also observed in the case of lentiviruses such as Visna (Harris 
et al., 1981; Querat et al., 1984). The presence of a large part of 
the F open reading frame in the LTR, and the use of tRNA(Lys) as a 
primer for (-) strand synthesis, is reminiscent of the mouse 
mammary tumor virus. On the other hand, homologies in the pol gene 
would suggest that the LAV is closer to RSV than to any other 
retroviruses. Obviously, no clear picture can be drawn from the DNA 
sequence analysis as far as phylogeny is concerned. Thus, it may 
well be that LAV defines a new group of retroviruses that have been 
independently evolving for a considerable period of time, and not 
simply a variant recently derived from a characterized viral 
family. Both epidemiology and pathogeny of AIDS should be 
reconsidered with this idea in mind, when trying to answer such 
questions as these: Are there other human or animal diseases that 
are associated with similarly organized viruses? Is there a 
precursor to AIDS-associated virus(es) normally present, in latent 
form, in human populations? What triggered in this case the recent 
spreading of pathogenic derivatives? 



M13 Cloning and Sequencing 

Total λJ19 DNA was sonicated, treated with the Klenow fragment of 
DNA polymerase plus deoxyribonucleotides (2 hr, [...]°C), and [...] 
electrophoresis and purified by Elutip ([...]) chromatography. DNA 
was ethanol-precipitated using 10 µg dextran [...] as carrier and 
ligated to [...] and transfected into E. coli strain TG-1. 
Recombinant [...] were detected by plaque hybridization using the 
appropriate labeled LAV [...] prepared from plaques exhibiting 
positive hybridization [...] sequenced by the dideoxy chain 
termination method [...] dATP (Amersham, 400 Ci/mmol) and buffer 
gradient gels ([...] et al., 1983). Sequences were compiled and 
analyzed using the programs of Staden adapted by [...] Caudron for 
the Institut Pasteur Computer Center (Staden, 1982). 



Strong-Stop cDNA 

LAV virions from infected T lymphocyte (Barre-Sinoussi et al., 
1983) culture supernatant were pelleted through a 20% sucrose 
cushion and the cDNA(-) strand was synthesized as described 
previously (Alizon et al., 1984) except that no exogenous primer 
was used. After alkaline hydrolysis ([...] M NaOH, 30 min, 65°C), 
neutralization and phenol extraction, the cDNA was 
ethanol-precipitated and loaded onto a 6% acrylamide/6 M urea 
sequencing gel with sequence ladders as size markers. 